Evaluating Commercial MT Systems
Abstract
Vendors of commercial machine translation (MT) systems often claim that their system can increase translator productivity x-fold. To verify such claims, we need to answer two questions. First, how is translator productivity generally measured? And second, precisely how does one go about comparing human translator (HT) productivity with MT productivity? The answer to the first question is relatively straightforward, at least for translators who are part of a translation service: productivity is generally measured as the number of words translated per unit of time. In fact, translators frequently have to meet production quotas (1300 words per day, for example), and their promotion may be contingent upon producing a certain number of words per year. The answer to the second question is slightly more complicated and involves, I would suggest, the comparison of two production chains: one in which the human translator works in tandem with the MT system, and another in which he works alone, without the aid of the system. Now there are many ways for a human translator to actually produce his texts: he can write them out, type them, dictate them, or use a word processor. Most commercial MT systems, on the other hand, come bundled with (or at least interface with) a word processor. My intuition six years ago, when I was asked to participate in a trial of the Weidner MicroCat system at the Canadian government’s Translation Bureau, was that the productivity gains reported by the vendor were at least partly attributable to the introduction of a word processor in place of more traditional modes of production. Be that as it may, it is surely important, when designing an MT trial, to attempt to isolate the contribution of the machine translation module to overall productivity, since it is this module, not the word processor, that costs the most to develop and that justifies the hefty price tag.
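The comparison the abstract argues for can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TrialResult` class, the `gain` function, and every figure below are hypothetical and are not taken from the Weidner MicroCat trial or any other study. The sketch also assumes a third production chain, a translator using the word processor alone, as one possible way to separate the word processor's contribution from that of the MT module.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    """Output of one production chain in a hypothetical trial."""
    chain: str
    words: int    # words of finished translation produced
    hours: float  # translator time spent, including any post-editing

    @property
    def words_per_hour(self) -> float:
        # Productivity as words translated per unit of time.
        return self.words / self.hours


def gain(candidate: TrialResult, baseline: TrialResult) -> float:
    """Productivity of `candidate` expressed as a multiple of `baseline`."""
    return candidate.words_per_hour / baseline.words_per_hour


# Illustrative numbers only; they are not results from any real trial.
traditional = TrialResult("typing or dictation", words=1300, hours=6.5)
wp_only = TrialResult("word processor alone", words=1300, hours=5.0)
wp_mt = TrialResult("word processor + MT", words=1300, hours=4.0)

overall_gain = gain(wp_mt, traditional)  # the kind of figure a vendor might quote
wp_gain = gain(wp_only, traditional)     # share attributable to the word processor
mt_gain = gain(wp_mt, wp_only)           # share attributable to the MT module

print(f"overall {overall_gain:.2f}x = "
      f"word processor {wp_gain:.2f}x * MT {mt_gain:.2f}x")
```

Because the two partial gains multiply to give the overall gain, a trial that lacks the word-processor-only chain cannot tell how much of the vendor's x-fold figure is due to the MT module itself.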
Similar resources
Probing the Lexicon in Evaluating Commercial MT Systems
In the past, the evaluation of machine translation systems focused on single-system evaluations because only a few systems were available. But now there are several commercial systems for the same language pair. This requires new methods of comparative evaluation. In the paper we propose a black-box method for comparing the lexical coverage of MT systems. The method is based on lists of ...
Designing A Mixed System of Network DEA for Evaluating the Efficiency of Branches of Commercial Banks in Iran
One of the most important applications of the data envelopment analysis technique is measuring the efficiency of bank branches. Performance measurement in the banking industry is important for several groups, including bank managers, customers, investors, and shareholders. The purpose of this study is to examine and design a mixed structure to measure the efficiency of branches of Iranian banks a...
A fuzzier approach to machine translation evaluation: A pilot study on post-editing productivity and automated metrics in commercial settings
Machine Translation (MT) quality is typically assessed using automatic evaluation metrics such as BLEU and TER. Despite being generally used in the industry for evaluating the usefulness of Translation Memory (TM) matches based on text similarity, fuzzy match values are not as widely used for this purpose in MT evaluation. We designed an experiment to test if this fuzzy score applied to MT outp...
After Linguistics-based MT
The 80s can be characterized as the era of Linguistics-based MT (LBMT) and of its failure in the history of MT, in which (computational) linguists have initiated the first serious attempt at constructing scientific or computational theories of MT. Partly because of a large discrepancy between scientific interests and engineering practices, this work has little influence on the performance of co...
Publication date: 2007